Three Experiments on Mining the Web for Ontology and Lexicon Learning

Authors

  • Sergei Nirenburg
  • Donald Dimitroff
  • Craig Pfeifer
Abstract

This paper describes an approach to alleviating the well-known knowledge acquisition bottleneck in knowledge-based systems. In knowledge-based, meaning-oriented natural language processing, the core knowledge resources are a semantic lexicon and an ontological world model in terms of which lexical meaning is expressed. We describe a mutual bootstrapping approach whereby existing resources are used to create additional resources that are then added to the original resources. Thus, text understanding bootstraps the learning process, which in turn boosts the knowledge resources underlying text understanding. Specifically, in our experiments the ontological-semantic text analyzer OntoSem uses an existing ontology and an existing semantic lexicon to analyze sentences that are mined from the web and contain specific words unknown to the system. The analyses are used to generate candidates for ontological concepts and lexicon entries that are not yet part of OntoSem’s static knowledge resources. The experiments described in the paper have three goals: (a) empirically determining the number of senses of a word; (b) automatically creating ontological concepts (named sets of property-value pairs) that describe the meanings of word senses; and (c) suggesting where the newly acquired concepts belong in the ontological network. The experimental environment is also used for empirical validation of ontological property values in concepts originally encoded by knowledge engineers.

Similar resources

Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

Due to the growth of the web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

Development of a Combined System Based on Data Mining and Semantic Web for the Diagnosis of Autism

Introduction: Autism is a nervous system disorder, and since there is no direct diagnosis for it, data mining can help diagnose the disease. Ontology as a backbone of the semantic web, a knowledge database with shareability and reusability, can be a confirmation of the correctness of disease diagnosis systems. This study aimed to provide a system for diagnosing autistic children with a combinat...

Optimizing Membership Functions using Learning Automata for Fuzzy Association Rule Mining

Transactions in web data often consist of quantitative data, suggesting that fuzzy set theory can be used to represent such data. The time spent by users on each web page, one type of web data, is regarded as a trapezoidal membership function (TMF) and can be used to evaluate user browsing behavior. The quality of mining fuzzy association rules depends on membership functions and since t...

Publication date: 2007